Iterative Critique-Refine Framework for Enhancing LLM Personalization
Maram, Durga Prasad, Gandhi, Dhruvin, Yao, Zonghai, Akkinapalli, Gayathri, Dernoncourt, Franck, Wang, Yu, Rossi, Ryan A., Ahmed, Nesreen K.
Personalized text generation requires models not only to produce coherent text but also to align with a target user's style, tone, and topical focus. Existing retrieval-augmented approaches such as LaMP and PGraphRAG enrich profiles with user and neighbor histories, but they stop at generation and often yield outputs that drift in tone, topic, or style. We present PerFine, a unified, training-free critique-refine framework that enhances personalization through iterative, profile-grounded feedback. In each iteration, an LLM generator produces a draft conditioned on the retrieved profile, and a critic LLM - also conditioned on the same profile - provides structured feedback on tone, vocabulary, sentence structure, and topicality. The generator then revises, while a novel knockout strategy retains the stronger draft across iterations. We further study additional inference-time strategies such as Best-of-N and Topic Extraction to balance quality and efficiency. Across Yelp, Goodreads, and Amazon datasets, PerFine consistently improves personalization over PGraphRAG, with GEval gains of +7-13%, steady improvements over 3-5 refinement iterations, and scalability with increasing critic size. These results highlight that post-hoc, profile-aware feedback offers a powerful paradigm for personalized LLM generation that is both training-free and model-agnostic.
Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction
The pseudocode for NeurHap-refine is given as Algorithm 1 (the local refinement algorithm). Two categories of datasets are used in the paper: polyploid species and viral quasispecies. BWA-MEM [Li, 2013] is used to align reads to the reference genome; the detailed command (taking the 15-strain ZIKV data as an example) begins with `$ ./bwa`. We follow [Vikalo, 2020a,b] to derive the SNP matrix from the above alignment to ensure a fair comparison.
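Since the Algorithm 1 pseudocode itself is not reproduced here, a generic greedy local-refinement pass for graph coloring can be sketched as follows. This illustrates only the conflict-reduction idea behind such refinement; it is not the authors' NeurHap-refine algorithm.

```python
# Generic greedy local refinement for graph coloring (NOT NeurHap-refine):
# repeatedly recolor each vertex to the color that minimizes conflicts
# with its neighbors, until no single-vertex move improves.

def conflicts(graph: dict, coloring: dict) -> int:
    """Count edges whose endpoints share a color (vertices assumed comparable)."""
    return sum(1 for u in graph for v in graph[u]
               if u < v and coloring[u] == coloring[v])

def local_refine(graph: dict, coloring: dict, num_colors: int, sweeps: int = 10) -> dict:
    coloring = dict(coloring)
    for _ in range(sweeps):
        improved = False
        for u in graph:
            # Count how many neighbors of u hold each candidate color.
            counts = [0] * num_colors
            for v in graph[u]:
                counts[coloring[v]] += 1
            best = min(range(num_colors), key=lambda c: counts[c])
            if counts[best] < counts[coloring[u]]:
                coloring[u] = best   # recolor u to reduce local conflicts
                improved = True
        if not improved:
            break
    return coloring
```

In the haplotype-assembly setting, colors correspond to haplotypes (or strains) and edges encode read-pair conflicts, so reducing coloring conflicts directly improves the assembly.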
LinkQA: Synthesizing Diverse QA from Multiple Seeds Strongly Linked by Knowledge Points
Zhang, Xuemiao, Ren, Can, Tu, Chengying, Weng, Rongxiang, Yan, Hongfei, Wang, Jingang, Cai, Xunliang
The advancement of large language models (LLMs) is hindered by the scarcity of high-quality, diverse training data. To address this limitation, we propose LinkSyn, a novel knowledge point (KP) graph-based synthesis framework that enables flexible control over discipline and difficulty distributions while balancing KP coverage and popularity. LinkSyn extracts KPs from question-answering (QA) seed data and constructs a KP graph to synthesize diverse QA data from multiple seeds strongly linked by KPs and sampled from graph walks. Specifically, LinkSyn incorporates (1) a knowledge distribution value function to guide the adjustment of path sampling probability and balance KP coverage and popularity during graph walks; (2) diffusion-based synthesis via DeepSeek-R1 by leveraging multiple seeds with dense logical associations along each path; and (3) high-difficulty QA enhancement within given disciplines by flexible difficulty adjustments. By executing LinkSyn, we synthesize LinkQA, a diverse multi-disciplinary QA dataset with 50B tokens. Extensive experiments on Llama-3 8B demonstrate that continual pre-training with LinkQA yields an average improvement of 11.51% on MMLU and CMMLU, establishing new SOTA results. LinkQA consistently enhances performance across model size and initial FLOPs scales.
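The value-guided graph walks in step (1) can be sketched as follows. The toy KP graph, the popularity table, and the specific value function (popularity down-weighted by prior coverage) are illustrative assumptions, not LinkSyn's actual formulas.

```python
import random
from collections import defaultdict

# Hedged sketch of value-guided walks over a knowledge-point (KP) graph.
kp_graph = {
    "algebra":   ["functions", "equations"],
    "functions": ["algebra", "calculus"],
    "equations": ["algebra"],
    "calculus":  ["functions"],
}
popularity = {"algebra": 10, "functions": 5, "equations": 2, "calculus": 1}
visit_count = defaultdict(int)   # tracks coverage across walks

def value(kp: str) -> float:
    # Favor popular KPs, but down-weight ones already covered often,
    # trading off coverage against popularity.
    return popularity[kp] / (1 + visit_count[kp])

def sample_path(start: str, length: int) -> list[str]:
    path = [start]
    visit_count[start] += 1
    for _ in range(length - 1):
        nbrs = kp_graph[path[-1]]
        weights = [value(n) for n in nbrs]
        nxt = random.choices(nbrs, weights=weights)[0]
        visit_count[nxt] += 1
        path.append(nxt)
    return path
```

Each sampled path groups seed QAs that are strongly linked through shared KPs; in the paper, those seeds are then handed jointly to the synthesis model (DeepSeek-R1) as context.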
Large-Scale Diverse Synthesis for Mid-Training
Zhang, Xuemiao, Tu, Chengying, Ren, Can, Weng, Rongxiang, Yan, Hongfei, Wang, Jingang, Cai, Xunliang
The scarcity of high-quality, knowledge-intensive training data hinders the development of large language models (LLMs), as traditional corpora provide limited information. Previous studies have synthesized and integrated corpora-dependent question-answering (QA) data to improve model performance but face challenges in QA data scalability and knowledge diversity, particularly in cross-domain contexts. To overcome these limitations, we propose a novel diversified pipeline to synthesize BoostQA, a 100B-token large-scale QA dataset. Our synthesis framework: (1) curates seed data from heterogeneous sources; (2) utilizes DeepSeek-R1 to implement STEM-focused multi-grade synthesis to boost data diversity and high-difficulty synthesis to mitigate difficulty degradation; (3) refines answers via DeepSeek-V3 to improve output quality. Furthermore, leveraging our designed discipline and difficulty annotation system, we probe model deficiencies in STEM disciplines and high-difficulty data. We utilize BoostQA in mid-training, a mid-stage between pre-training and post-training, to optimize domain-specific knowledge acquisition and enhance data quality. Our method enables Llama-3 8B, mid-trained on a 40B-token dataset, to achieve an average improvement of 12.74% on MMLU and CMMLU and establish SOTA average performance across 12 benchmarks. BoostQA also demonstrates robust scalability, with performance consistently improving as model size, data volume, and initial FLOPs scale.
Preference Curriculum: LLMs Should Always Be Pretrained on Their Preferred Data
Zhang, Xuemiao, Xu, Liangyu, Duan, Feiyu, Zhou, Yongwei, Wang, Sirui, Weng, Rongxiang, Wang, Jingang, Cai, Xunliang
Large language models (LLMs) generally utilize a consistent data distribution throughout the pretraining process. However, as the model's capability improves, it is intuitive that its data preferences change dynamically, indicating the need for pretraining with different data at different training stages. To achieve this, we propose the Perplexity Difference (PD) based Preference Curriculum learning (PDPC) framework, which continually perceives and uses the data preferred by LLMs to train and improve them. First, we introduce the PD metric to quantify the difference in how challenging a sample is for weak versus strong models. Samples with high PD are more difficult for weak models to learn and are better arranged in the later stages of pretraining. Second, we propose a preference function to approximate and predict the LLM's data preference at any training step, so that the dataset can be arranged offline and training can continue without interruption. Experimental results on 1.3B and 3B models demonstrate that PDPC significantly surpasses the baselines. Notably, the 3B model trained on 1T tokens achieves an average accuracy gain of over 8.1% across MMLU and CMMLU.
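The PD-based ordering can be sketched as follows. The relative normalization `(ppl_weak - ppl_strong) / ppl_weak` is an assumption for illustration; the paper may define PD differently.

```python
import math

# Sketch of the perplexity-difference (PD) idea: a sample's PD compares
# how hard it is for a weak vs. a strong model.

def perplexity(nll_per_token: list[float]) -> float:
    """Perplexity from a list of per-token negative log-likelihoods."""
    return math.exp(sum(nll_per_token) / len(nll_per_token))

def pd(weak_nll: list[float], strong_nll: list[float]) -> float:
    ppl_w = perplexity(weak_nll)
    ppl_s = perplexity(strong_nll)
    # High PD: hard for the weak model, comparatively easy for the strong one.
    return (ppl_w - ppl_s) / ppl_w

# Curriculum: sort samples by PD so that high-PD (harder) samples are
# scheduled into the later stages of pretraining.
samples = [("a", pd([2.0, 2.1], [1.0, 1.1])),
           ("b", pd([1.2, 1.1], [1.0, 1.1]))]
ordered = sorted(samples, key=lambda s: s[1])  # low PD first, high PD last
```

Because the ordering is computed from two fixed reference models, the whole curriculum can be arranged offline, which is what allows uninterrupted training.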
Cross-Modal Consistency in Multimodal Large Language Models
Zhang, Xiang, Li, Senyu, Shi, Ning, Hauer, Bradley, Wu, Zijun, Kondrak, Grzegorz, Abdul-Mageed, Muhammad, Lakshmanan, Laks V. S.
Recent developments in multimodal methodologies have marked the beginning of an exciting era for models adept at processing diverse data types, encompassing text, audio, and visual content. Models like GPT-4V, which merge computer vision with advanced language processing, exhibit extraordinary proficiency in handling intricate tasks that require a simultaneous understanding of both textual and visual information. Prior research efforts have meticulously evaluated the efficacy of these Vision Large Language Models (VLLMs) in various domains, including object detection, image captioning, and other related fields. However, existing analyses have often suffered from limitations, primarily centering on the isolated evaluation of each modality's performance while neglecting to explore their intricate cross-modal interactions. Specifically, the question of whether these models achieve the same level of accuracy when confronted with identical task instances across different modalities remains unanswered. In this study, we take the initiative to delve into the interaction and comparison among these modalities of interest by introducing a novel concept termed cross-modal consistency. Furthermore, we propose a quantitative evaluation framework founded on this concept. Our experimental findings, drawn from a curated collection of parallel vision-language datasets developed by us, unveil a pronounced inconsistency between the vision and language modalities within GPT-4V, despite its portrayal as a unified multimodal model. Our research yields insights into the appropriate utilization of such models and hints at potential avenues for enhancing their design.
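The cross-modal consistency measurement reduces to an agreement rate over parallel instances. The sketch below uses precomputed toy answers standing in for real text-mode and vision-mode model calls; the instance IDs and answers are invented for illustration.

```python
# Toy answers standing in for a model's responses to the SAME task instance
# presented once as text and once as an image (parallel vision-language data).
text_answers  = {"q1": "7", "q2": "cat", "q3": "blue"}
image_answers = {"q1": "7", "q2": "dog", "q3": "blue"}

def consistency(instance_ids: list[str]) -> float:
    """Fraction of parallel instances where the two modalities agree."""
    agree = sum(text_answers[i] == image_answers[i] for i in instance_ids)
    return agree / len(instance_ids)

# Here the modalities agree on q1 and q3 but not q2, so consistency is 2/3.
```

Note that consistency is deliberately independent of correctness: a model can be wrong in both modalities yet perfectly consistent, which is exactly why the paper measures it separately from per-modality accuracy.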
Multiply-Robust Causal Change Attribution
Quintas-Martinez, Victor, Bahadori, Mohammad Taha, Santiago, Eduardo, Mu, Jeff, Janzing, Dominik, Heckerman, David
Comparing two samples of data, we observe a change in the distribution of an outcome variable. In the presence of multiple explanatory variables, how much of the change can be explained by each possible cause? We develop a new estimation strategy that, given a causal model, combines regression and re-weighting methods to quantify the contribution of each causal mechanism. Our proposed methodology is multiply robust, meaning that it still recovers the target parameter under partial misspecification. We prove that our estimator is consistent and asymptotically normal. Moreover, it can be incorporated into existing frameworks for causal attribution, such as Shapley values, which will inherit the consistency and large-sample distribution properties. Our method demonstrates excellent performance in Monte Carlo simulations, and we show its usefulness in an empirical application.
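The Shapley-value layer of the attribution can be sketched as follows. In the paper, `delta(S)` would be the multiply-robust estimate of the distribution change when only the mechanisms in `S` are switched to the second sample; here it is a toy additive stand-in, and the mechanism names are invented.

```python
from itertools import combinations
from math import factorial

# Hedged sketch: Shapley attribution of a distribution change across
# causal mechanisms, given a set function delta(S).

def shapley(mechanisms, delta):
    n = len(mechanisms)
    phi = {m: 0.0 for m in mechanisms}
    for m in mechanisms:
        others = [x for x in mechanisms if x != m]
        for k in range(n):
            for S in combinations(others, k):
                # Standard Shapley weight |S|! (n-|S|-1)! / n!
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[m] += w * (delta(set(S) | {m}) - delta(set(S)))
    return phi

# Toy additive value function: each mechanism contributes a fixed amount
# to the overall change in the outcome distribution.
effects = {"X1": 0.3, "X2": -0.1, "Y|X": 0.5}
attributions = shapley(list(effects), lambda S: sum(effects[m] for m in S))
# For an additive delta, Shapley recovers each mechanism's own effect.
```

The consistency and asymptotic normality results in the paper carry over to these Shapley attributions because each is a fixed linear combination of the `delta(S)` estimates.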